In this paper we propose a simple and efficient strategy to obtain a data structure
generator that accomplishes a perfect hash of quite general order restricted
multidimensional arrays, named phormas. The constructor of such objects gets two
parameters as input: an n-vector $a$ of non-negative integers and a boolean function
$B$ on the types of order restrictions on the coordinates of the valid n-vectors
bounded by $a$. At compile time, the phorma constructor builds, from the pair
$(a, B)$, a digraph $G(a, B)$ with a single source $s$ and a single sink $t$ such
that the st-paths are in 1-1 correspondence with the members of the B-restricted
a-bounded array $A(a, B)$. Besides perfectly hashing $A(a, B)$, $G(a, B)$ is an
instance of an NW-family. This permits other useful computational tasks on it.

1 Motivation and objective

This work introduces a new type of data structure generator named phorma,
$P = (a, B)$, which consists of a positive integer n-vector $a$ and a boolean
function $B$ whose literals are order restrictions on the components of the
n-vectors $\alpha$ dominated by $a$, that is, $\alpha_i \le a_i$, $i = 1, 2, \ldots, n$.
The simplest example of a phorma arises in the need to store a symmetric
$(p \times q)$-matrix. In this case the phorma is
$P^{sym}_{2} = (a = (p, q),\ B = (\alpha_1 \ge \alpha_2))$. Our basic goal is
to enumerate in an efficient way all the equivalence classes of indices given
that the matrix is symmetric. The work that motivates phormas, and where
their first real use appears, is [4]. Trying to avoid duplicates in the huge set of
equivalence classes of indices of some 3-dimensional matrices, we were led to
implement the phormas
$P^{sym}_{3} = (a = (p, q, r),\ B = (\alpha_1 \ge \alpha_2) \wedge (\alpha_2 \ge \alpha_3))$
and $P^{sym}_{3,2} = (a = (p, q, r),\ B = (\alpha_1 \ge \alpha_2))$. The first phorma
arises when there are symmetries permuting arbitrarily all the three coordinates.
The second, when the first and second coordinates can be interchanged, but the
third is held fixed. These phormas play a crucial role in the algorithms of [4].

To better motivate the concept and to help the reader grasp the definition
of the general problem we treat, we discuss at length a less trivial example of
a phorma, arising in packing rectangles into rectangular and L-shaped
pieces [5]. An L-shaped piece is a rectangle $R$ from which we have removed
a smaller rectangle $r \subset R$. Moreover, $R$ and $r$ have a corner in common. By
effecting rotations, translations and reflections we may suppose that our
L-shaped piece has a corner at the origin and that the vertex common to $r$ and
$R$ is the vertex opposite to the origin in rectangle $R$. Positioned in this
canonical way, the L-piece is represented by a quadruple of real numbers
$(\alpha_1, \alpha_2, \alpha_3, \alpha_4)$, with $\alpha_1 \ge \alpha_3$ and
$\alpha_2 \ge \alpha_4$, where the big rectangle $R$ has diagonal from $(0, 0)$
to $(\alpha_1, \alpha_2)$ and the smaller rectangle $r$ has diagonal from
$(\alpha_3, \alpha_4)$ to $(\alpha_1, \alpha_2)$.
Let $a = a_1a_2a_3a_4$ be a positive integer 4-vector with $a_1 \ge a_3$,
$a_2 \ge a_4$. In [5] we need to enumerate the canonically positioned L-shaped
pieces with integer coordinates $\alpha \le a$, that is, the L-pieces
$\alpha = \alpha_1\alpha_2\alpha_3\alpha_4$ with (1) $\alpha_1 \ge \alpha_3$ and
(2) $\alpha_2 \ge \alpha_4$ which are dominated by $a$: $\alpha_i \le a_i$,
$i = 1, 2, 3, 4$. Symmetry considerations enable us to partition the set of
a-bounded L-pieces into equivalence classes and to distinguish a set $A$ of
representatives for these classes.

For our occupancy purposes in [5] the L-pieces $\alpha_1\alpha_2\alpha_3\alpha_4$
and $\alpha_2\alpha_1\alpha_4\alpha_3$ must be considered equivalent: one such
L-piece is transformed into the other by a reflection along the line passing
through the origin and having slope 1. This is simply an axis interchange. With
this in mind we have the following order restrictions for a representative of an
equivalence class: (3) $\alpha_1 \ge \alpha_2$, otherwise we could use
$\alpha_2\alpha_1\alpha_4\alpha_3$. Also, (4) $\alpha_1 = \alpha_2 \Rightarrow
\alpha_3 \ge \alpha_4$, otherwise we could use $\alpha_2\alpha_1\alpha_4\alpha_3$
again. In terms of occupancy, $\alpha_1\alpha_2\alpha_1\alpha_4$ with
$\alpha_4 < \alpha_2$, which is a degenerate L, can (and must) be replaced by
the rectangle $\alpha_1\alpha_2\alpha_1\alpha_2$. Analogously,
$\alpha_1\alpha_2\alpha_3\alpha_2$ with $\alpha_3 < \alpha_1$ can be replaced by
$\alpha_1\alpha_2\alpha_1\alpha_2$. In this way, the equivalence
$\alpha_1 = \alpha_3 \Leftrightarrow \alpha_2 = \alpha_4$ holds. The equivalence
is rewritten as two opposite implications in the disguised form:
(5) $((\alpha_1 \ne \alpha_3) \vee (\alpha_2 = \alpha_4))$
and (6) $((\alpha_2 \ne \alpha_4) \vee (\alpha_1 = \alpha_3))$. The restrictions
(1) to (6) are gathered in a boolean expression $B_L$:

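$B_L = (\alpha_1 \ge \alpha_3) \wedge (\alpha_2 \ge \alpha_4) \wedge (\alpha_1 \ge \alpha_2) \wedge ((\alpha_1 \ne \alpha_2) \vee (\alpha_3 \ge \alpha_4)) \wedge$
$((\alpha_1 \ne \alpha_3) \vee (\alpha_2 = \alpha_4)) \wedge ((\alpha_2 \ne \alpha_4) \vee (\alpha_1 = \alpha_3))$.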
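
As an illustration, the restrictions (1) to (6) can be written as a predicate and the set $A(a, B_L)$ enumerated by brute force. This is only a minimal Python sketch, not the implementation used in [5]; the names `b_l` and `members` are hypothetical, and the bound a = 7575 is the one used in the examples of the following sections.

```python
from itertools import product

def b_l(alpha):
    """Evaluate B_L on a 4-vector alpha = (alpha1, alpha2, alpha3, alpha4)."""
    a1, a2, a3, a4 = alpha
    return (a1 >= a3 and a2 >= a4 and a1 >= a2      # restrictions (1), (2), (3)
            and (a1 != a2 or a3 >= a4)              # restriction (4)
            and (a1 != a3 or a2 == a4)              # restriction (5)
            and (a2 != a4 or a1 == a3))             # restriction (6)

def members(a, pred):
    """Positive integer vectors dominated by a that satisfy pred (lexicographic order)."""
    ranges = [range(1, ai + 1) for ai in a]
    return [alpha for alpha in product(*ranges) if pred(alpha)]

A = members((7, 5, 7, 5), b_l)
print(len(A))
```

Such an exhaustive scan is only practical for small bounds; it serves here as a reference against which a hashing scheme can be checked.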

In general, a phorma, or a perfectly hashed order restricted multidimensional
array, is a pair $P = (a, B)$ where $a$ is an n-vector of positive integers and,
for $\alpha$ a positive integer n-vector dominated by $a$, $B$ is a boolean function
whose literals are of type $(\alpha_i * \alpha_j)$, where
$* \in \{<, >, \le, \ge, =, \ne\}$. The set $A = A(P) = A(a, B)$ (of
representatives of the classes, in the case of the L-pieces) is formed by the
$\alpha$'s dominated by $a$ and satisfying $B$.

Our objective in this work is, given any phorma $P = (a, B)$, to produce a
constructive bijection $h$ between $A = A(P)$ and $\{0, 1, \ldots, |A| - 1\}$, so that
both $h$ and $h^{-1}$ are efficiently computable. Such functions are called perfect
hash functions [3], [2] and their usefulness is well known in computer science.

As far as we know, the problem of finding perfect hash functions for these
quite general multidimensional arrays has not been considered before in the
literature, whence the lack of more specific references and bibliography. Our
solution is based on the theory of Nijenhuis and Wilf, chapter 13 of [6]. Their
NW-combinatorial families associate a digraph to a set of combinatorial objects
in such a way that an object is in 1-1 correspondence with a path in
the digraph. See also a more detailed account of these combinatorial families
in Wilf's book [7], available at his page on the internet. A phorma is a
particular case of an NW-combinatorial family, specialized in boolean order
specified multidimensional arrays. Their intrinsic structure permits us to
accelerate, as we show in the final section, the computation of $h(\alpha)$ and
$h^{-1}(w)$.



2 The (n, m)-patterns

For $m \in \{1, 2, \ldots, n\} = N$, an (n, m)-pattern $\beta = \beta_1\beta_2 \cdots \beta_n$
is a sequence of length n in which each of the m symbols 1, 2, ..., m occurs at
least once. Given a phorma $(a, B)$ and $\alpha \in A = A(a, B)$ with $m_\alpha$
distinct entries, there exists a unique $(n, m_\alpha)$-pattern, denoted by
$\beta^\alpha$, which is order compatible with $\alpha$: for $i \in N$, if
$\alpha_i$ is the k-th smallest entry among the ones appearing in $\alpha$, then
define $\beta^\alpha_i = k$. As some examples, consider the phorma
$(a = 7575, B_L)$, where $B_L$ appears in the previous section. We have
$\beta^{7412} = 4312$, $\beta^{5521} = 3321$, $\beta^{5533} = 2211$,
$\beta^{3333} = 1111$. Let the set $\mathcal{L} = \mathcal{L}(P) = \mathcal{L}(a, B)$
of all (n, m)-patterns induced by $\alpha \in A(a, B)$,

$\mathcal{L} = \mathcal{L}(P) = \mathcal{L}(a, B) = \{\beta^\alpha \mid \alpha \in A(a, B)\} = (\beta^1, \beta^2, \ldots, \beta^q)$,

be given by a list in lexicographical order. The list $\mathcal{L}$ induces a
partition of $A$ into q parts: indeed, defining
$[\beta^j] = \{\alpha \in A \mid \beta^\alpha = \beta^j\}$, we get
$A = \bigcup_{j=1}^{q} [\beta^j]$, with $[\beta^j] \cap [\beta^k] = \emptyset$ if
$j \ne k$. For the phorma $P^L_{7575} = (a = 7575, B_L)$ we get

$\mathcal{L}^L = \mathcal{L}(P^L_{7575}) = (1111, 2121, 2211, 3211, 3221, 3321, 4231, 4312, 4321)$.
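
The pattern map and the partition it induces can be checked directly on this example. The Python sketch below is illustrative only: `pattern_of`, `b_l` and `classes` are hypothetical names, and the set $A$ is obtained by exhaustive enumeration rather than by any method of the paper.

```python
from itertools import product

def pattern_of(alpha):
    """Order pattern of alpha: entry i becomes k when alpha[i] is the k-th smallest distinct value."""
    rank = {v: k for k, v in enumerate(sorted(set(alpha)), start=1)}
    return tuple(rank[v] for v in alpha)

def b_l(alpha):
    """The boolean function B_L of the L-piece example."""
    a1, a2, a3, a4 = alpha
    return (a1 >= a3 and a2 >= a4 and a1 >= a2 and (a1 != a2 or a3 >= a4)
            and (a1 != a3 or a2 == a4) and (a2 != a4 or a1 == a3))

a = (7, 5, 7, 5)
classes = {}  # pattern -> list of the vectors of A having that pattern
for alpha in product(*(range(1, ai + 1) for ai in a)):
    if b_l(alpha):
        classes.setdefault(pattern_of(alpha), []).append(alpha)

patterns = sorted(classes)                    # the list of patterns, lexicographic
sizes = [len(classes[p]) for p in patterns]   # cardinality of each class
print(patterns, sizes)
```

The class sizes add up to the cardinality of $A$, as expected from a partition.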

There are only mild restrictions on the set $\mathcal{L}$: its cardinality, q,
should be small enough for the $\beta^j$'s to be kept in core; also, $\mathcal{L}$
should have enough structure to be effectively generated by an implicit
enumeration scheme. In the implementation of a phorma $(a, B)$, the first task of
the constructor of the data structure [1] phorma (which is activated at compile
time) is to obtain the list $\mathcal{L}$. How is this done? In many applications
the dimension n of the phorma is small enough for trying all $n^n$ sequences of
length n in the symbols 1, 2, ..., n, choosing the ones which are (n, m)-patterns
and testing them for B-satisfiability [2]. The sequences that survive are added
to $\mathcal{L}$. In the above case $n^n = 256$ and this simple-minded approach is
convenient. In some cases it is more efficient to use appropriate
NW-combinatorial families [6], [7] which generate only (n, m)-patterns. Here we
avoid details of these specific families. In other cases, the list $\mathcal{L}$
is obtained by implicit enumeration. In any case, testing B-satisfiability is
unavoidable and is the computational bottleneck for the phorma constructor in
obtaining $\mathcal{L}$.
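
The simple-minded approach can be sketched as follows in Python. The function names are hypothetical. Since $B$ only involves order relations, the sketch evaluates it on the candidate pattern itself; it also assumes, as holds for a = 7575, that every surviving pattern is realized by some vector dominated by $a$, a check that a real constructor would presumably have to add.

```python
from itertools import product

def is_pattern(seq):
    """True if seq uses exactly the symbols 1..m, where m = max(seq)."""
    return set(seq) == set(range(1, max(seq) + 1))

def candidate_patterns(n, pred):
    """Scan all n**n sequences over 1..n; keep the patterns that satisfy pred."""
    return [s for s in product(range(1, n + 1), repeat=n)
            if is_pattern(s) and pred(s)]

def b_l(alpha):
    """The boolean function B_L of the L-piece example."""
    a1, a2, a3, a4 = alpha
    return (a1 >= a3 and a2 >= a4 and a1 >= a2 and (a1 != a2 or a3 >= a4)
            and (a1 != a3 or a2 == a4) and (a2 != a4 or a1 == a3))

pattern_list = candidate_patterns(4, b_l)
print(pattern_list)
```

Because `product` yields its tuples in lexicographic order, the resulting list is already sorted.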



3 The digraphs $H_\gamma$'s, $H_a$ and $G(P) = G(a, B)$

Throughout this work $\gamma = \gamma_1\gamma_2 \ldots \gamma_m$ is a strictly
increasing m-sequence, $m \le n$, with entries in $\mathbb{N}$. For each
$\alpha \in A$, let $\gamma^\alpha$ denote the strictly increasing sequence of
length $m_\alpha$ of the $m_\alpha$ distinct entries appearing in
$\alpha_1\alpha_2 \cdots \alpha_n$. Observe that $\alpha$ is recoverable from
(induced by) the pair $(\beta^\alpha, \gamma^\alpha)$. The $\alpha$ so induced by
$(\beta, \gamma)$, where $\beta$ is an (n, m)-pattern and $\gamma$ is a strictly
increasing m-sequence with entries in $\mathbb{N}$, is denoted
$\alpha^*(\beta, \gamma)$. As examples, in the phorma $P^L_{7575}$,
$\alpha^*(3221, 457) = 7554$, $\alpha^*(4321, 3457) = 7543$,
$\alpha^*(4231, 4567) = 7564$. As we shall see, the simple correspondence
$\alpha \mapsto (\beta^\alpha, \gamma^\alpha)$ and its inverse,
$(\beta, \gamma) \mapsto \alpha^*(\beta, \gamma)$, are central for the efficient
implementation of the hash function $h$ and its inverse.
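
Both directions of this correspondence are short to write down. The Python sketch below is illustrative; `decompose` and `alpha_star` are hypothetical names for $\alpha \mapsto (\beta^\alpha, \gamma^\alpha)$ and $(\beta, \gamma) \mapsto \alpha^*(\beta, \gamma)$.

```python
def decompose(alpha):
    """Split alpha into its pattern beta and its increasing sequence gamma of distinct entries."""
    gamma = tuple(sorted(set(alpha)))
    index = {v: k for k, v in enumerate(gamma, start=1)}
    beta = tuple(index[v] for v in alpha)
    return beta, gamma

def alpha_star(beta, gamma):
    """Rebuild alpha from (beta, gamma): symbol k of beta is replaced by the k-th entry of gamma."""
    return tuple(gamma[k - 1] for k in beta)

print(alpha_star((3, 2, 2, 1), (4, 5, 7)))
print(decompose((7, 5, 6, 4)))
```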

For $\beta \in \mathcal{L}$ the $(a, \beta)$-maximal increasing sequence, denoted
by $\gamma^*(a, \beta) = \gamma^*_1\gamma^*_2 \cdots \gamma^*_m$, is the strictly
increasing sequence of length m satisfying the following conditions: suppose
that, for $1 \le i \le m$, $i$ occurs at positions $p_{i1}, \ldots, p_{ij_i}$ of
$\beta$; recall that $a = a_1a_2 \ldots a_n$ and define
$\gamma^*_m = \min\{a_{p_{m1}}, a_{p_{m2}}, \ldots, a_{p_{mj_m}}\}$ and, for
$i = m-1, m-2, \ldots, 1$,
$\gamma^*_i = \min\{a_{p_{i1}}, \ldots, a_{p_{ij_i}}, \gamma^*_{i+1} - 1\}$.
Observe that $\gamma^*(a, \beta)$ can alternatively be defined as the
lexicographically maximal $\gamma$ such that $\alpha^*(\beta, \gamma) \in A$.

Having constructed the list $\mathcal{L} = \mathcal{L}(a, B) = (\beta^1, \beta^2, \ldots, \beta^q)$,
the next task for the phorma constructor is to obtain a corresponding list
$\Gamma = \Gamma(a, \mathcal{L}) = (\gamma^*(a, \beta^1), \gamma^*(a, \beta^2), \ldots, \gamma^*(a, \beta^q))$.
As an example to help the understanding of how to obtain $\Gamma$, consider its
construction for the phorma $P^L_{7575}$. We get

$\Gamma^L_{7575} = \Gamma(P^L_{7575}) = (5, 57, 45, 457, 457, 345, 4567, 3457, 3457)$.
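
The construction of $\gamma^*(a, \beta)$ translates almost literally into code. The following Python sketch (with the hypothetical name `gamma_star`) processes the symbols from m down to 1, as in the definition above.

```python
def gamma_star(a, beta):
    """The (a, beta)-maximal increasing sequence, built from symbol m down to symbol 1."""
    m = max(beta)
    out = [0] * m
    upper = None  # value chosen for symbol k + 1, once known
    for k in range(m, 0, -1):
        # Smallest bound a_i over the positions i where symbol k occurs in beta.
        bound = min(a[i] for i, b in enumerate(beta) if b == k)
        if upper is not None:
            bound = min(bound, upper - 1)  # keep the sequence strictly increasing
        out[k - 1] = bound
        upper = bound
    return tuple(out)

patterns = [(1,1,1,1), (2,1,2,1), (2,2,1,1), (3,2,1,1), (3,2,2,1),
            (3,3,2,1), (4,2,3,1), (4,3,1,2), (4,3,2,1)]
Gamma = [gamma_star((7, 5, 7, 5), beta) for beta in patterns]
print(Gamma)
```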

Suppose $\gamma = \gamma_1\gamma_2 \ldots \gamma_m$ is an increasing m-sequence with
entries in the set of positive integers. We want to define a digraph $H_\gamma$.
If $\gamma_m > m$, let ${\leftarrow}\gamma$ denote the increasing sequence of
length m satisfying $({\leftarrow}\gamma)_m = \gamma_m - 1$ and
$({\leftarrow}\gamma)_i = \min\{({\leftarrow}\gamma)_{i+1} - 1, \gamma_i\}$, for
$i = 1, 2, \ldots, m-1$. If $\gamma_m = m$, then ${\leftarrow}\gamma$ does not
exist. If $\gamma \ne t$, let ${\swarrow}\gamma$ be the sequence of length $m - 1$
obtained from $\gamma$ by removing its last entry:
${\swarrow}\gamma = \gamma_1 \ldots \gamma_{m-1}$. If $\gamma = t$, then
${\swarrow}\gamma$ does not exist. Given $\gamma', \gamma \in \Gamma_\infty$, we say
that $\gamma' \preceq \gamma$ if there is a sequence
$(\gamma = \gamma^1, \gamma^2, \ldots, \gamma^p = \gamma')$, with
$\gamma^i \in \Gamma_\infty$, such that, for each $i = 1, 2, \ldots, p-1$, either
$\gamma^{i+1} = {\leftarrow}(\gamma^i)$ or else
$\gamma^{i+1} = {\swarrow}(\gamma^i)$. The relation $\preceq$ is a partial order
on the set $\Gamma_\infty$ of all finite increasing sequences with integer
entries. For $\gamma \in \Gamma_\infty$, let $H_\gamma$ be the acyclic digraph
whose vertex set is $V_\gamma = \{\gamma' \mid \gamma' \preceq \gamma\}$. The empty
increasing sequence is considered a member of $\Gamma_\infty$. It corresponds to
a terminal vertex (the unique sink), and so is represented by $t$. From each
vertex $\gamma' \in V_\gamma$ there are at most two outgoing edges, whose heads
are ${\leftarrow}\gamma'$ (if it exists) and ${\swarrow}\gamma'$ (if it exists).
These are all the edges, which concludes the definition of $H_\gamma$. In Fig. 1
we show all the graphs $H_{\gamma^j}$, $j = 1, 2, \ldots, 9$, corresponding to
$\Gamma^L_{7575}$. Since $\gamma^*(a, \beta^4) = \gamma^*(a, \beta^5)$ and
$\gamma^*(a, \beta^8) = \gamma^*(a, \beta^9)$ we get only seven distinct
digraphs. In picturing them, the directions of the edges are implicit. They go
from higher vertices to lower ones and, in the case of a tie, the direction is
from right to left.
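
The two edge operations and the resulting vertex set can be sketched in Python as below. The names `west`, `southwest` and `vertices` are hypothetical; a sequence is represented as a tuple and the sink $t$ as the empty tuple.

```python
def west(gamma):
    """Head of the west edge at gamma, or None if that edge does not exist."""
    m = len(gamma)
    if m == 0 or gamma[-1] <= m:      # the last entry already equals the length
        return None
    out = [0] * m
    out[-1] = gamma[-1] - 1
    for i in range(m - 2, -1, -1):    # fill the remaining entries right to left
        out[i] = min(out[i + 1] - 1, gamma[i])
    return tuple(out)

def southwest(gamma):
    """Head of the southwest edge at gamma (drop the last entry), or None at the sink."""
    return gamma[:-1] if gamma else None

def vertices(gamma):
    """All sequences reachable from gamma, i.e. the vertex set of H_gamma."""
    seen, stack = set(), [gamma]
    while stack:
        g = stack.pop()
        if g in seen:
            continue
        seen.add(g)
        for h in (west(g), southwest(g)):
            if h is not None:
                stack.append(h)
    return seen

print(sorted(vertices((2, 3, 5)), key=lambda g: (len(g), g)))
```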
Figure 1: Digraphs $H_5$, $H_{57}$, $H_{45}$, $H_{457}$, $H_{345}$, $H_{4567}$, $H_{3457}$



The digraph $H_a$ is defined as the $\mathcal{L}$-indexed union of digraphs
$H_\gamma$'s:

$H_a = \bigcup \{H_{\gamma^*(a, \beta)} \mid \beta \in \mathcal{L}\}$.

The digraph of the phorma $P = (a, B)$ is $G(a, B) = H_a \cup \Lambda(a, B)$, where
the digraph $\Lambda(a, B)$ consists of a root $s$ linked to vertices labelled by
$\beta^j$, $j = 1, 2, \ldots, q$. Each vertex $\beta^j$ is of valency 2. The edge
from $s$ enters it and there is an edge from it to the vertex of $H_a$ labelled
by $\gamma^*(a, \beta^j) = \gamma^j$. The total number of edges of $\Lambda(a, B)$
is $2q$, finishing its definition. This also concludes the definition of the
digraph $G(a, B)$. In Fig. 2 we show $G(P^L_{7575}) = G^L_{7575}$. Observe that
$\Lambda(a, B)$ is depicted in dashed gray edges. The numbers in gray are
important in the computation of $h(\alpha)$ and are explained in the next section.




Figure 2: Digraph $G^L_{7575} = G(7575, B_L)$ of the phorma $P^L_{7575} = (7575, B_L)$



4 NW-Combinatorial Families 



We briefly recall the general concept of an NW-combinatorial family. An
example of such an object is the digraph $H_\gamma$. The combinatorial family
that it encodes is formed by the strictly increasing m-sequences $\gamma'$ with
entries in $\mathbb{N}$ dominated by $\gamma$, $\gamma'_i \le \gamma_i$,
$i = 1, \ldots, m$. Also the digraph $G(a, B)$, for any phorma $(a, B)$, is an
NW-combinatorial family.

The following concept, introduced in [6], is the central tool for this work.
A Nijenhuis-Wilf combinatorial family or NW-family is a digraph $G$, whose
vertex set is denoted by $V(G)$, having the properties below:

• $V(G)$ has a partial order (for $x, y \in V(G)$, $y \preceq x$ if there is a
directed path from $x$ to $y$) with a unique minimal element $t$. For each
$v \in V(G)$ the set $\{x \in V(G) \mid x \preceq v\}$ is finite and includes $t$.

• Every vertex $v$, except $t$, has a strictly positive outvalence $\rho(v)$. For
each $v \in V(G)$, the set $E(v)$ of outgoing edges has a local rank-label
$\ell_v$, $0 \le \ell_v(e) \le \rho(v) - 1$, $e \in E(v)$.

Every directed path in $G$ starting from a vertex $v$ and ending at $t$ is called
a combinatorial object of order $v$. Thus, the set of objects of order $v$ is
identified with the vertex $v$. Denote by $|v|$ the cardinality of the set of
objects of order $v$, namely, the number of paths from $v$ to $t$. In Fig. 2, the
value $|v|$ is shown as a gray number next to vertex $v$; it is the first of the
two gray numbers in the case that $v = \beta^j$. The significance of the second
gray number associated with each $\beta^j$ in Fig. 2 is explained in the final
section. The local rank-labels of the outgoing edges at $s$ in the
NW-combinatorial family $G(a, B)$ are given by the lexicographical order of their
heads $\beta^j$. The unique outgoing edge at $\beta^j$ has local rank-label 0.
The local rank-label of the outgoing edges at a vertex $\gamma$ of
$H_{\gamma^*}$ is 0 for the west edge (the one with head ${\leftarrow}\gamma$),
if it exists, implying 1 for the southwest edge (the one with head
${\swarrow}\gamma$). Of course, if a vertex has only its southwest edge, the
local rank-label of this edge is 0. Even though in the drawings the edges
arriving at $t$ are not in the southwest direction (to decrease the width of the
figures), all of them are considered southwest edges.

From the definitions we immediately get a recursive formula for $|v|$:
$|v| = \sum \{|head(e)| \mid e \in E(G),\ tail(e) = v\}$, with $|t| = 1$. This
recursive formula follows from the fact that a path from $v$ to $t$ is an
outgoing edge from $v$ followed by a path representing a combinatorial object of
smaller order. Therefore, the role of the graph $G$ defining the
NW-combinatorial family is to display how the combinatorial elements of the
various orders are inductively formed. The usefulness of the notion of
combinatorial family is that (i) a great number of usual combinatorial objects
can be encoded as paths in an NW-family; (ii) the local rank-labels of the
outgoing edges induce a unique ranking $h$ of the combinatorial objects of order
$v$. With respect to this ranking the following tasks become computationally
simple and as cheap as they can be. The tasks are exemplified and described in
terms of the paths in the digraph $G$, without mentioning the specific
combinatorial families that $G$ encodes. More details of the algorithms to
perform these tasks can be found in Chapter 13 of [6].

Task 0: counting: What is the cardinality of the family? Algorithm: As we have
mentioned, $|v| = \sum \{|head(e)| \mid e \in E(G),\ tail(e) = v\}$. It is then
possible for the constructor of the phorma to obtain the value of each $|v|$ by
recursion and to store it as an attribute of $v \in V(G)$ in a pre-processing
phase (compilation time). For instance, the phorma $P^L_{7575}$ has cardinality
190. This is the value of $|s|$ in Fig. 2.
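
For the digraphs $H_\gamma$ the counting recursion can be sketched with memoization, as in the Python fragment below. The names `west` and `count` are hypothetical, the sink $t$ is the empty tuple with $|t| = 1$, and the cache stands in for the attribute stored at each vertex.

```python
from functools import lru_cache

def west(gamma):
    """Head of the west edge at gamma, or None if it does not exist."""
    m = len(gamma)
    if m == 0 or gamma[-1] <= m:
        return None
    out = [0] * m
    out[-1] = gamma[-1] - 1
    for i in range(m - 2, -1, -1):
        out[i] = min(out[i + 1] - 1, gamma[i])
    return tuple(out)

@lru_cache(maxsize=None)
def count(gamma):
    """|gamma|: number of paths from gamma to the sink t (the empty tuple)."""
    if not gamma:
        return 1
    total = count(gamma[:-1])          # paths through the southwest edge
    w = west(gamma)
    if w is not None:
        total += count(w)              # paths through the west edge
    return total

Gamma = [(5,), (5, 7), (4, 5), (4, 5, 7), (4, 5, 7), (3, 4, 5),
         (4, 5, 6, 7), (3, 4, 5, 7), (3, 4, 5, 7)]
print(sum(count(g) for g in Gamma))    # cardinality of the phorma, i.e. |s|
```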

Task 1: sequencing: Given an object in the family, construct the "next" object.
Algorithm: A path starting at $v$ and ending at $t$ is encoded by the sequence
of rank-labels of its edges. The next path of a given path $\pi$ is, in coded
form, the lexicographic successor of $\pi$. In coded form the 7 paths from the
vertex $v$ of the NW-combinatorial family $H_{\gamma^*}$ of Fig. 3 are:
rank 0 -> 00000, rank 1 -> 01000, rank 2 -> 01100, rank 3 -> 0111,
rank 4 -> 1000, rank 5 -> 1100, rank 6 -> 111. In Theorem 1 we shall see that
these paths are in 1-1 correspondence with the sequence of $\gamma$'s
(123, 124, 134, 234, 125, 135, 235). This is the sequence, in rank order,
of all strictly increasing sequences of length 3 in {1, 2, 3, 4, 5} dominated by
$\gamma^* = 235$.



Figure 3: All paths from $v = \gamma^* = 235$ to $t$ in $H_{235}$



Task 2: ranking (perfect hashing): Given an object $\omega$ in the family, find
the integer $h(\omega)$ such that $\omega$ is the $h(\omega)$-th element in the
order induced by task 1. Algorithm: Let an element-path $\pi$ of order $v$ of an
NW-family, $\pi = (e_1, e_2, \ldots, e_p)$, be given. The rank of $\pi$ is defined
as $h(\pi) = \sum_{i=1}^{p} \chi(e_i)$, where, for an edge $e$ with tail $v$,
$\chi(e) = \sum \{|head(f)| \mid f \in E(v),\ \ell_v(f) < \ell_v(e)\}$. In the
NW-combinatorial family $H_{\gamma^*}$, this formula for $\pi$ is particularly
simple: the value $h(\pi)$ is obtained as the sum of the orders of the
post-falls of $\pi$ (defined in the beginning of the next section). In Fig. 3,
the post-falls of the paths are the white vertices.
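
For the family $H_{\gamma^*}$ the ranking rule can be sketched as follows in Python. The names `west`, `count` and `rank_by_falls` are hypothetical; the sketch assumes that the argument `gamma` is strictly increasing, positive and dominated by `gstar`, and it walks the path one edge at a time.

```python
from functools import lru_cache

def west(g):
    """Head of the west edge at g, or None if it does not exist."""
    m = len(g)
    if m == 0 or g[-1] <= m:
        return None
    out = [0] * m
    out[-1] = g[-1] - 1
    for i in range(m - 2, -1, -1):
        out[i] = min(out[i + 1] - 1, g[i])
    return tuple(out)

@lru_cache(maxsize=None)
def count(g):
    """|g|: number of paths from g to the sink (the empty tuple)."""
    if not g:
        return 1
    w = west(g)
    return count(g[:-1]) + (count(w) if w is not None else 0)

def rank_by_falls(gstar, gamma):
    """Rank of gamma among the sequences dominated by gstar."""
    rank, v = 0, gstar
    for target in reversed(gamma):
        while v[-1] > target:          # walk west until the last entry matches
            v = west(v)
        w = west(v)                    # v is now a fall; w is its post-fall, if any
        if w is not None:
            rank += count(w)
        v = v[:-1]                     # follow the southwest edge
    return rank

seqs = [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4), (1, 2, 5), (1, 3, 5), (2, 3, 5)]
print([rank_by_falls((2, 3, 5), g) for g in seqs])
```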

Task 3: unranking: Given an integer $r$, construct the $r$-th member of the
family. Algorithm: Given an integer $r$, we need to construct the $r$-th path
from $v$ to $t$. Consider $pred_v(e)$ as the highest-rank edge of the set
$\{f \in E(v) \mid \ell_v(f) < \ell_v(e)\}$, and let $|head(pred_v(e))| = 0$ if
this set is empty. The required $r$-th path $\pi_r$ is generated as follows:
$\pi_r \leftarrow \emptyset$; $r' \leftarrow 0$; $v' \leftarrow v$; repeat:
append to $\pi_r$ the highest-rank edge $e$ of $E(v')$ such that
$r' + |head(pred_{v'}(e))| \le r$; $r' \leftarrow r' + |head(pred_{v'}(e))|$;
$v' \leftarrow head(e)$; until $v' = t$. It should not be difficult to check
this unranking algorithm on the paths of Fig. 3.
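
A possible rendering of unranking for $H_{\gamma^*}$ is sketched below in Python. It is not a transcription of the loop above: instead of accumulating $r'$ it subtracts from $r$ the number of paths skipped at each vertex, which amounts to the same thing when a vertex has at most two outgoing edges. The names `west`, `count` and `unrank` are hypothetical, and $0 \le r < |\gamma^*|$ is assumed.

```python
from functools import lru_cache

def west(g):
    """Head of the west edge at g, or None if it does not exist."""
    m = len(g)
    if m == 0 or g[-1] <= m:
        return None
    out = [0] * m
    out[-1] = g[-1] - 1
    for i in range(m - 2, -1, -1):
        out[i] = min(out[i + 1] - 1, g[i])
    return tuple(out)

@lru_cache(maxsize=None)
def count(g):
    """|g|: number of paths from g to the sink (the empty tuple)."""
    if not g:
        return 1
    w = west(g)
    return count(g[:-1]) + (count(w) if w is not None else 0)

def unrank(gstar, r):
    """The sequence of rank r among those dominated by gstar."""
    v, last_entries = gstar, []
    while v:
        w = west(v)
        skipped = count(w) if w is not None else 0
        if r < skipped:
            v = w                         # the r-th path continues through the west edge
        else:
            r -= skipped                  # skip every path through the west edge
            last_entries.append(v[-1])    # v is a fall; record its last entry
            v = v[:-1]                    # follow the southwest edge
    return tuple(reversed(last_entries))

print([unrank((2, 3, 5), r) for r in range(7)])
```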

Task 4: getting a random object: Choose an object uniformly at random from the
given family. Algorithm: Let $\xi \in [0, 1]$ be uniformly chosen at random;
return the $(|v| \cdot \xi)$-th object.



5 1-1 Correspondences



Let $\pi$ be a path which starts at $\gamma^*$ and finishes at $t$. A fall of
$\pi$ is the tail of a southwest edge of $\pi$; thus $\pi$ has m falls, where m
is the length of $\gamma^*$. A post-fall of $\pi$ is the head of the west edge
leaving a fall, when this edge exists. Path $\pi$ has at most m post-falls.

Theorem 1 The paths from $\gamma^*$ to $t$ in the digraph $H_{\gamma^*}$ are in
1-1 correspondence with the strictly increasing m-sequences with entries in
$\mathbb{N}$ which are dominated by $\gamma^*$.

Proof: Any such path $\pi$ is in 1-1 correspondence with its sequence of
falls $(\gamma^m, \ldots, \gamma^2, \gamma^1)$. Note that $\gamma^j$
($j = 1, 2, \ldots, m$) is the last vertex of $\pi$ whose defining sequence has
length $j$. Let $\gamma^\pi_j = \gamma^j_j$, $j = 1, 2, \ldots, m$. Clearly
$\gamma^\pi_j = \gamma^j_j \le \gamma^*_j$, and $\gamma^\pi$ is dominated by
$\gamma^*$. Reciprocally, given a $\gamma$ dominated by $\gamma^*$, construct a
$\pi_\gamma = (\gamma^m, \ldots, \gamma^2, \gamma^1)$ so that, starting from
vertex $\gamma^*$, the last vertex whose defining sequence has length $j$ is
$\gamma^j$, defined when we impose the equality $\gamma^j_j = \gamma_j$. With
these definitions, it follows that $\pi_{\gamma^\pi} = \pi$, proving the
Theorem. ■



To exemplify the above inverse constructions, consider the path $\pi$ from
$\gamma^* = 8CFJ$ to $t$ in $H_{8CFJ}$ (subscript in base 20: A = 10, B = 11,
..., J = 19) defined by the sequence of its falls (8CDE, 567, 34, 3). Path $\pi$
induces $\gamma^\pi$, given by the last digits of the falls in reverse order:
$\gamma^\pi = 347E$. Reciprocally, given $\gamma = 347E$, starting at 8CFJ the
last digit of the first fall of the path $\pi_\gamma$ that we seek is the fourth
digit of $\gamma$. Thus, we must go J - E = 5 steps to the left to arrive at
8CDE, defining the first fall of $\pi_\gamma$. Following the southwest edge we
arrive at 8CD. We know that the last digit of the second fall of $\pi_\gamma$
must be 7 (the third digit of $\gamma$). Thus we must go D - 7 = 6 steps to the
left, arriving at the second fall 567. Go southwest, arriving at 56. The last
digit of the third fall is the second digit of $\gamma$, namely 4. We must go
6 - 4 = 2 steps left, arriving at the third fall 34. Go southwest, arriving at
3. The last digit of the fourth fall is the first digit of $\gamma$, namely 3.
We must go 3 - 3 = 0 steps left to get the fourth fall of $\pi_\gamma$, namely
3. In this way, from $\gamma$ and $\gamma^*$ we have obtained the sequence of
falls (8CDE, 567, 34, 3). This sequence of falls defines $\pi_\gamma$. Clearly,
$\pi_{\gamma^\pi} = \pi$.

Figure 4: Digraph $H_{8,12,15,19}$ (vertex labels in base 20). Vertex labels of
length 4 recovered from the figure: 1234, 2345, 3456, 4567, 5678, 6789, 789A,
89AB, 8ABC, 8BCD, 8CDE, 8CEF, 8CFG, 8CFH, 8CFI, 8CFJ.



Theorem 2 For any phorma $P = (a, B)$, the st-paths in the digraph $G(a, B)$ are
in 1-1 correspondence with $A(a, B)$.

Proof: Let $\pi$ be an st-path in the digraph $G(a, B)$ and
$(s, \beta^\pi, \gamma^1, \gamma^2, \ldots, \gamma^p, t)$ be the sequence of
vertices in $\pi$. By Theorem 1, the subpath from $\gamma^1$ to $t$, which is in
$H_{\gamma^1}$, defines a $\gamma^\pi$ dominated by $\gamma^1$. The $\alpha$ which
corresponds to $\pi$ is $\alpha^*(\beta^\pi, \gamma^\pi)$. Reciprocally, given
$\alpha \in A(a, B)$, consider the pair $(\beta^\alpha, \gamma^\alpha)$. Let
$\pi'$ be the path in $H_{\gamma^*(a, \beta^\alpha)}$ from
$\gamma^*(a, \beta^\alpha)$ to $t$ which corresponds to $\gamma^\alpha$, given by
Theorem 1. The st-path $\pi$ in $G(a, B)$ which corresponds to $\alpha$ is
obtained from $\pi'$ by prefixing to it the two edges from $s$ to $\beta^\alpha$
and from $\beta^\alpha$ to $\gamma^*(a, \beta^\alpha)$. The correspondences
$\pi \mapsto \alpha$ and $\alpha \mapsto \pi$ are inverses, establishing the
Theorem. ■



6 Implementation Issues 



Entering a generic phorma type boolean function B. A convenient
way to store such boolean functions is by means of a tree $T(B)$ with three
types of internal nodes: $\vee$-nodes, $\wedge$-nodes and $\neg$-nodes. The leaves
of the tree correspond to the basic constituent boolean functions of type
$\alpha_i * \alpha_j$, where $* \in \{<, >, \le, \ge, =, \ne\}$. The $\neg$-nodes
(negation operator) must have at most one child. Note that each subtree rooted
at an internal $\circ$-node $v$ ($\circ \in \{\vee, \wedge, \neg\}$) is a boolean
tree obtained by taking the $\circ$-operation of the boolean tree(s)
corresponding to the children of $v$. Given an $\alpha$, it is rather easy to
decide B-satisfiability of $\alpha$ by evaluating from the leaves up until
arriving at the root of $T(B)$.
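
One way to picture such a tree is as nested tuples, evaluated recursively. The Python sketch below is only illustrative: the tuple encoding, the tags "and", "or", "not", "lit" and the helper `lit` are assumptions of the example, not a format prescribed by the paper. The tree shown encodes $B_L$ of Section 1.

```python
import operator

# Comparison operators allowed in the leaves.
OPS = {"<": operator.lt, ">": operator.gt, "<=": operator.le,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}

def lit(i, op, j):
    """Leaf comparing coordinates i and j (1-based) with operator op."""
    return ("lit", i, op, j)

def evaluate(tree, alpha):
    """Evaluate the boolean tree on alpha, from the leaves up."""
    kind = tree[0]
    if kind == "and":
        return all(evaluate(child, alpha) for child in tree[1:])
    if kind == "or":
        return any(evaluate(child, alpha) for child in tree[1:])
    if kind == "not":
        return not evaluate(tree[1], alpha)
    _, i, op, j = tree                      # a leaf
    return OPS[op](alpha[i - 1], alpha[j - 1])

B_L = ("and", lit(1, ">=", 3), lit(2, ">=", 4), lit(1, ">=", 2),
       ("or", lit(1, "!=", 2), lit(3, ">=", 4)),
       ("or", lit(1, "!=", 3), lit(2, "==", 4)),
       ("or", lit(2, "!=", 4), lit(1, "==", 3)))

print(evaluate(B_L, (7, 5, 5, 4)))
```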

Properly storing $H_a$. Let $a^*$ be the maximum of the $a_i$'s. Consider a
bidimensional array $R[0..n, 1..a^*]$, in which cell $R[m, p]$ contains the
address of a simple linked list containing, in $\gamma$-lexicographical order,
all the pairs $({\swarrow}\gamma, |\gamma|)$ in which $\gamma$ is a vertex of
$H_a$ that, as a sequence, has length m and satisfies $\gamma_m = p$. The need
for the pairs $({\swarrow}\gamma, |\gamma|)$ becomes clear when performing the
rank operation efficiently, as explained below. The maximum length of the list
$R[m, p]$, denoted by $|R[m, p]|$, is $|\mathcal{L}(a, B)|$; however, these
lengths tend to be very small. In particular, if all the entries of $a$ are
equal, or if $p \le 2$, $|R[m, p]| = 1$. In the example of Fig. 2, the only
entry $|R[m, p]|$ which is not 1 is $|R[4, 7]| = 2$. Graph $H_a$ is stored as a
hash table $R[0..n, 1..a^*]$ in which the pairs $({\swarrow}\gamma, |\gamma|)$
having $\gamma$ with the same length m and the same last element p are stored
together in a $\gamma$-lexicographically ordered list (to resolve the
conflicts). We consider that a binary search in the list $R[m, p]$ to locate
the specific pair $({\swarrow}\gamma, |\gamma|)$ is good enough.

Ranking in the NW-Family $H_{\gamma^*}$. In order to obtain the rank of
$\gamma \in H_{\gamma^*}$ of length m based on a usual pointer implementation
([2]) of $H_a$ we may need to walk along a path $\pi_\gamma$ of length
$a^* + m$, where $a^* = \max\{a_i \mid i = 1, 2, \ldots, n\}$. By using the above
hash table to store $H_a$, we do the job in m steps. This is a critical
speed-up, since in most applications $a^* \gg m$. Let $h_{\gamma^*}(\gamma)$
denote the rank of $\gamma$ in $H_{\gamma^*}$. Let m be the length of $\gamma$
and $\gamma^m, \ldots, \gamma^2, \gamma^1$ the sequence of falls of
$\pi_\gamma$. We know that $h_{\gamma^*}(\gamma)$ is the sum of the orders of
the corresponding post-falls,
$|{\leftarrow}\gamma^m| + |{\leftarrow}\gamma^{m-1}| + \ldots + |{\leftarrow}\gamma^2| + |{\leftarrow}\gamma^1|$.
In this rank formula, if ${\leftarrow}\gamma^i$ does not exist then
$|{\leftarrow}\gamma^i|$ is defined as 0. Let ${\leftarrow}{\leftarrow}\gamma$ be
denoted by ${}^{2}{\leftarrow}\gamma$,
${\leftarrow}{\leftarrow}{\leftarrow}\gamma$ by ${}^{3}{\leftarrow}\gamma$, etc.;
also ${}^{0}{\leftarrow}\gamma = \gamma$. If $\gamma_m \ge m + j$, then
${}^{j}{\leftarrow}\gamma$ exists and is given by
$({}^{j}{\leftarrow}\gamma)_i = \min\{\gamma_m - j - m + i, \gamma_i\}$, for
$i = 1, 2, \ldots, m$. The $\gamma^i$'s can be found as follows: let
$\xi_m = \gamma^*_m - \gamma_m$ and
$\gamma^m = {}^{\xi_m}{\leftarrow}(\gamma^*)$. For $i = m-1, m-2, \ldots, 2, 1$
let $\xi_i = \gamma^{i+1}_i - \gamma_i$ and
$\gamma^i = {}^{\xi_i}{\leftarrow}({\swarrow}\gamma^{i+1})$. Since we need only
$|{\leftarrow}\gamma^m|, |{\leftarrow}\gamma^{m-1}|, \ldots, |{\leftarrow}\gamma^2|, |{\leftarrow}\gamma^1|$,
it is enough to store, for each vertex $\gamma \in V(H_a)$, the pair of entries
$(\gamma, |\gamma|)$. All such pairs with $\gamma$ of length m and
$\gamma_m = p$ are stored in the list $R[m, p]$, ordered lexicographically by
$\gamma$. Since all $\gamma$'s in the pairs $(\gamma, |\gamma|)$ stored at
$R[m, p]$ satisfy $\gamma_m = p$, we may drop the last entry of $\gamma$ and
store the pairs $({\swarrow}\gamma, |\gamma|)$.
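
The closed form for ${}^{j}{\leftarrow}\gamma$ is what allows each fall to be reached in one jump. The Python sketch below illustrates this with the hypothetical names `west_j`, `count` and `fast_rank`; the memoized `count` merely stands in for the lookup of $|\gamma|$ in the table $R$, which is not modelled here, and `gamma` is assumed to be dominated by `gstar`.

```python
from functools import lru_cache

def west_j(gamma, j):
    """The j-fold west image of gamma in closed form, or None if it does not exist."""
    m = len(gamma)
    if m == 0 or gamma[-1] < m + j:
        return None
    return tuple(min(gamma[-1] - j - m + i, gamma[i - 1]) for i in range(1, m + 1))

@lru_cache(maxsize=None)
def count(gamma):
    """|gamma|; a stand-in for the value that would be read from the table R."""
    if not gamma:
        return 1
    w = west_j(gamma, 1)
    return count(gamma[:-1]) + (count(w) if w is not None else 0)

def fast_rank(gstar, gamma):
    """Rank of gamma in H_gstar using one jump per entry of gamma (m steps)."""
    rank, v = 0, gstar
    for i in range(len(gamma), 0, -1):
        xi = v[-1] - gamma[i - 1]      # number of west steps to the i-th fall
        fall = west_j(v, xi)
        post = west_j(fall, 1)         # post-fall, if it exists
        if post is not None:
            rank += count(post)
        v = fall[:-1]                  # southwest edge
    return rank

print(west_j((8, 12, 15, 19), 5))      # the fall 8CDE of the example of Section 5
print(fast_rank((2, 3, 5), (1, 3, 5)))
```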

Ranking in the NW-Family $G(a, B)$. For a vertex $v$ with at most one
incoming edge $e_v$ of an NW-family, let $\|v\| = \chi(e_v)$. This is the case of
$s$ and of each $\beta$ in $\mathcal{L}(a, B)$. The value of $\|s\|$ is zero and
the values of the $\|\beta\|$'s are pre-computed for each $\beta$. Translating
from the general recipe for ranking in an NW-family to our specific case,

$h(\alpha) = \|s\| + \|\beta^\alpha\| + h_{\gamma^*(a, \beta^\alpha)}(\gamma^\alpha) = \|\beta^\alpha\| + h_{\gamma^*(a, \beta^\alpha)}(\gamma^\alpha)$.

$\beta^\alpha$ is located by a binary search on $\mathcal{L}(a, B)$. From the pair
$(a, \beta^\alpha)$ we compute $\gamma^*(a, \beta^\alpha)$, which is the entry
point in $H_a$ of the path $\pi_\alpha$.
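
Putting the pieces together, the whole hash $h(\alpha)$ for the phorma $P^L_{7575}$ can be sketched in Python as below. This is a sketch under stated assumptions rather than the implementation described above: the class `Phorma` and all function names are hypothetical, the pattern list is built by the simple $n^n$ scan, unrealizable patterns are dropped by an added check, $|\gamma|$ is memoized instead of being read from the table $R$, and `h` assumes its argument belongs to $A(a, B)$.

```python
from bisect import bisect_left
from functools import lru_cache
from itertools import accumulate, product

def b_l(al):
    """The boolean function B_L of the L-piece example."""
    a1, a2, a3, a4 = al
    return (a1 >= a3 and a2 >= a4 and a1 >= a2 and (a1 != a2 or a3 >= a4)
            and (a1 != a3 or a2 == a4) and (a2 != a4 or a1 == a3))

def west(g):
    """One step west, or None when the last entry already equals the length."""
    m = len(g)
    if m == 0 or g[-1] <= m:
        return None
    return tuple(min(g[-1] - 1 - m + i, g[i - 1]) for i in range(1, m + 1))

@lru_cache(maxsize=None)
def count(g):
    """|g|: number of paths from g to the sink t (the empty tuple)."""
    if not g:
        return 1
    w = west(g)
    return count(g[:-1]) + (count(w) if w is not None else 0)

def gamma_star(a, beta):
    """The (a, beta)-maximal increasing sequence."""
    out, upper = [], None
    for k in range(max(beta), 0, -1):
        bound = min(a[i] for i, b in enumerate(beta) if b == k)
        upper = bound if upper is None else min(bound, upper - 1)
        out.append(upper)
    return tuple(reversed(out))

def rank_in_h(gstar, gamma):
    """Rank of gamma in H_gstar: sum of the orders of the post-falls."""
    rank, v = 0, gstar
    for target in reversed(gamma):
        while v[-1] > target:
            v = west(v)
        w = west(v)
        rank += count(w) if w is not None else 0
        v = v[:-1]
    return rank

class Phorma:
    def __init__(self, a, pred):
        n = len(a)
        cands = (s for s in product(range(1, n + 1), repeat=n)
                 if set(s) == set(range(1, max(s) + 1)) and pred(s))
        pairs = [(s, gamma_star(a, s)) for s in cands]
        pairs = [(s, g) for s, g in pairs if g[0] >= 1]  # assumption: drop unrealizable patterns
        self.patterns = [s for s, _ in pairs]            # lexicographic list of patterns
        self.gstars = [g for _, g in pairs]
        # offsets[j] plays the role of ||beta^j||: paths through earlier patterns.
        self.offsets = [0] + list(accumulate(count(g) for g in self.gstars))

    def h(self, alpha):
        gamma = tuple(sorted(set(alpha)))
        index = {v: k for k, v in enumerate(gamma, start=1)}
        beta = tuple(index[v] for v in alpha)
        j = bisect_left(self.patterns, beta)             # binary search for beta
        return self.offsets[j] + rank_in_h(self.gstars[j], gamma)

P = Phorma((7, 5, 7, 5), b_l)
print(P.offsets[-1], P.h((7, 5, 5, 4)))
```

On this example the values of `h` over all of $A(7575, B_L)$ cover each integer from 0 to 189 exactly once, which is the perfect hash property.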
7 Conclusion

We have defined a data structure generator which permits the perfect hash
of order restricted multidimensional arrays $A(a, B)$. The restrictions cover
a very general type of boolean function $B$, formed by order restricting pairs
of entries of the array in arbitrary ways. Our scheme is the conjunction of
two ideas: (1) to make a list $\mathcal{L}(a, B)$ of the order patterns, which
induce a partition of $A(a, B)$; (2) to treat together distinct patterns which
have the same increasing sequences of symbols. In consequence, an n-vector
$\alpha \in A(a, B)$ is subdivided into two pieces of information: $\beta$, the
pattern associated to $\alpha$, and $\gamma$, the increasing sequence of distinct
symbols appearing in $\alpha$. This encoding has the power of perfectly
addressing huge arrays $A(a, B)$ by means of logarithmically smaller digraphs
$G(a, B)$ (NW-combinatorial families). This general type of perfect hash scheme
does not seem to have been treated before in the literature. In particular, its
application to database systems is a possible source of relevant uses and
remains to be investigated.